Mining Network Logs: Information Quality Challenges

Authors

  • James A. Pelletier
  • Tamraparni Dasu
Abstract

Network logs are key to many critical functions such as network security, network monitoring, and network management. They play an important role in intrusion detection and in early warning of potential worm and virus attacks. However, network log data are under-utilized: they are largely ignored until an event occurs that requires backtracking for diagnostic purposes. There are two main reasons why network logs are not subjected to more rigorous analysis: their sheer volume and their inherent information quality challenges. In this paper, we use the context of classical data quality principles to outline some of the issues that we encountered, and the solutions that we devised, while working on a real network management application involving large amounts of network log data. While our discussion is centered on our case study, the problems we encountered and the solutions we devised are general and apply to a wide array of network log data and applications.


Similar resources

Problems and Challenges When Implementing a Best Practice Approach for Process Mining in a Tourist Information System

The application of process mining techniques for analyzing customer journeys seems promising for different stakeholders in the tourism domain: tourism providers can, for example, find attractive offers or partner services, and guests can improve their holiday experience. One precondition for mining processes is (high-quality) logs. This paper reports on experiences in implementing a dat...


Graph or Relational Databases: A Speed Comparison for Process Mining Algorithm

Process-Aware Information Systems (PAIS) are IT systems that manage and support business processes and generate large event logs from the execution of those processes. An event log is represented as a tuple of the form CaseID, TimeStamp, Activity and Actor. Process Mining is an emerging area of research that deals with the study and analysis of business processes based on event logs. Process Minin...


Discovering Emerging Patterns for Anomaly Detection in Network Connection Data

Most intrusion detection approaches rely on the analysis of the packet logs recording each noticeable event happening in the network system. Network connections are then constructed on the basis of these packet logs. Searching for abnormal connections is where the application of data mining techniques for anomaly detection promises great potential benefits. However, mining packet logs poses addit...


Wanna Improve Process Mining Results? It’s High Time We Consider Data Quality Issues Seriously

The growing interest in process mining is fueled by the increasing availability of event data. Process mining techniques use event logs to automatically discover process models, check conformance, identify bottlenecks and deviations, suggest improvements, and predict processing times. The lion’s share of process mining research has been devoted to analysis techniques. However, the proper handling o...


Safelog: Supporting Web Search and Mining by Differentially-Private Query Logs

Query logs can be very useful for advancing web search and web mining research. Since these web query logs contain private, possibly sensitive data, they need to be effectively anonymized before they can be released for research use. Anonymization of query logs differs from that of structured data because query logs are generated from natural language and their vocabulary (domain) is infinite. This u...




Publication date: 2005